An experience on statistical machine translation between Spanish and the regional languages of Spain

Authors

  • Mireia Farrús
  • Gonzalo Iglesias
  • Carlos Henríquez
  • Marc Poch
  • Roberto Muñoz
  • Nerea Ezeiza
  • Eduardo R. Banga
  • José B. Mariño
Abstract

Statistical machine translation systems between Spanish and the other regional languages of Spain have become a subject of research interest over the last decade. However, regional languages usually lack the linguistic resources needed to build such systems. This paper describes the development of three statistical machine translation systems between Spanish and three other languages (Galician, Catalan and Basque), focusing on the corpora used and the techniques applied to improve their performance.


Similar resources

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...


Overcoming statistical machine translation limitations: error analysis and proposed solutions for the Catalan-Spanish language pair

This work aims to improve an N-gram-based statistical machine translation system between the Catalan and Spanish languages, trained with an aligned Spanish–Catalan parallel corpus consisting of 1.7 million sentences taken from El Periódico...


An Open-Source Shallow-Transfer Machine Translation Engine for the Romance Languages of Spain

We present the current status of development of an open-source shallow-transfer machine translation engine for the Romance languages of Spain (the main ones being Spanish, Catalan and Galician) as part of a larger government-funded project which includes non-Romance languages such as Basque and involving both universities and linguistic technology companies. The machine translation architecture...


Catalan-English Statistical Machine Translation without Parallel Corpus: Bridging through Spanish

This paper presents a full experiment on large-vocabulary Catalan-English statistical machine translation without an English-Catalan parallel corpus, in the context of the debates of the European Parliament. For this, we make use of an English-Spanish European Parliament Proceedings parallel corpus and a Spanish-Catalan general newspaper parallel corpus, both of more than 30 M words. G...


N-best Reordering in Statistical Machine Translation

As statistical machine translation (SMT) systems strive to improve the translation quality they are able to deliver, the word reordering problem is being unveiled as a major problem that must be addressed if these systems are to be improved. While most published works focus their results on corpora involving English, Chinese and Arabic, such a translation problem can also be found within...



Publication date: 2009